1.0 INTRODUCTION


Over  the  past  decade,  there  has  been  considerable  interest  in  the  use  of  virtual  audio 
displays  to  reduce  operator  workload  in  challenging  working  environments,  such  as  aircraft 
cockpits.  These  virtual  audio  displays  use  digital  signal  processing  technology  to  add  spatial 
information  to  sound  sources  presented  over  headphones.  The  spatial  information  is  encoded 
in  the  Head-Related  Transfer  Function  (HRTF),  which  represents  the  transformations  that 
occur  in  a  sound  wave  as  it  propagates  from  a  distant  sound  source  to  the  left  and  right 
ears  of  a  listener.  Specifically,  the  HRTF  encodes  three  types  of  spatial  information:  1)  The 
interaural time delay between the arrival of the sound at the closer ear and the arrival of the
sound at the more distant ear; 2) The interaural intensity difference, caused by the acoustic
shadowing  of  the  sound  by  the  head  at  the  more  distant  ear;  3)  Direction-dependent  spectral 
cues  caused  by  the  filtering  of  the  outer  ear  or  pinna.  An  extensive  review  of  auditory  display 
technology  is  provided  by  Wenzel  (1991). 
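
As a rough illustration of how a virtual audio display applies this spatial information, the
sketch below convolves a monaural signal with a left-ear and a right-ear impulse response (the
time-domain equivalent of the HRTF). It is a minimal sketch and not the Convolvotron's
implementation; the function names and the toy impulse responses are assumptions made for
illustration only.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sequences of floats."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Filter one monaural signal into a (left, right) headphone pair."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Hypothetical toy responses for a source on the listener's left:
# the right ear receives the sound 3 samples later (time delay)
# and at half the amplitude (intensity difference).
hrir_left = [1.0, 0.0, 0.0, 0.0]
hrir_right = [0.0, 0.0, 0.0, 0.5]

left, right = render_binaural([1.0, -1.0, 0.5], hrir_left, hrir_right)
```

Real impulse responses are much longer and also carry the pinna's spectral cues, which
these toy responses leave out.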

Many  researchers  have  noted  that  the  direction-dependent  properties  of  the  pinna,  which 
are  known  to  be  extremely  important  in  determining  the  elevation  of  a  sound  source,  depend 
on  the  complicated  geometry  of  the  outer  ear  and  vary  substantially  across  listeners  (Butler, 
1987). Thus, it is believed that listeners can localize more accurately with HRTFs collected
from  their  own  ears  than  with  a  generic  HRTF  collected  from  another  person’s  ears  (Wenzel, 
Arruda,  Kistler,  &  Wightman,  1993).  However,  the  collection  of  HRTFs  can  be  a  tedious 
process  that  requires  an  elaborate  experimental  apparatus  and  access  to  an  anechoic  chamber. 

This  report  evaluates  the  SNAPSHOT  MX,  a  commercial  system  that  was  designed  to 
allow  rapid  collection  of  individualized  HRTFs  in  relatively  small,  non-anechoic  rooms.  The 
system  is  designed  to  produce  HRTF  files  that  can  be  implemented  with  the  Convolvotron 
virtual audio display system. The first section provides a detailed description of the
SNAPSHOT. The second section describes the results of a psychoacoustic validation study that
compares  auditory  localization  performance  with  (1)  individualized  HRTFs  collected  with  the 
SNAPSHOT  system;  (2)  a  generic  set  of  transfer  functions  provided  with  the  Convolvotron; 
and  (3)  a  generic  set  of  transfer  functions  measured  with  the  SNAPSHOT. 
2.0 THE SNAPSHOT SYSTEM


2.1  Origins  of  the  SNAPSHOT  system 

The SNAPSHOT is a product of Crystal River Engineering, a subsidiary of Aureal
Semiconductor located in Fremont, California. The system tested in this evaluation, the
SNAPSHOT MX version 1.3, was produced in 1996. The SNAPSHOT MX v 1.3 was apparently a
pre-production  version  of  the  SNAPSHOT  system,  as  the  manuals  provided  are  incomplete 
and  marked  DRAFT,  and  many  of  the  software  components  described  in  the  manual  are 
non-functional.  The  SNAPSHOT  product  line,  along  with  the  related  Convolvotron  product 
line,  has  been  discontinued  by  Aureal  Semiconductor  and  technical  support  for  the  system 
is  no  longer  available. 


2.2  Description  of  the  SNAPSHOT  system 

The  SNAPSHOT  system  consists  of  three  major  components: 


Alphatron EL Workstation: The Alphatron EL workstation is a signal processing system
based on a Pentium PC running Windows 95 (Figure 1). The Alphatron EL is equipped
with a proprietary sound card with two D/A output channels and two A/D input channels,
and a standard commercial sound card with two A/D inputs and two D/A outputs. The
software for the SNAPSHOT system, which was preloaded at the factory, is written in MATLAB
and in pre-compiled MATLAB MEX modules. The Alphatron workstation is the heart
of the SNAPSHOT system, responsible for signal generation and processing.


Speaker and Speaker Stand: The SNAPSHOT uses a Bose Acoustimass speaker, attached
to an adjustable height stand, to make the HRTF measurements (Figure 2). The
stand is approximately 1.8 m tall and can be used to place the loudspeaker at elevation
angles ranging from -36° to +54°. A Sony commercial receiver/amplifier is provided for
driving the loudspeaker.

Figure 1: Alphatron EL signal processing workstation.


Figure 2: Adjustable speaker and stand.


Figure 3: In-ear microphones shown with the microphone power supply (top) and inserted
in the subject’s ears (bottom).


Blocked Meatus Microphones: The SNAPSHOT includes a pair of small microphones
that are placed in the ears of the subject during the SNAPSHOT measurements (see Figure 3).
The microphones are designed to be manually placed in the ear canal opening and
to completely block the ear canal. This type of HRTF measurement is known as a
blocked-meatus measurement (Moller, Sorensen, Hammershoi, & Jensen, 1995).


2.3  Transfer  function  measurement  with  the  SNAPSHOT  system 

The SNAPSHOT system uses a transfer function measurement technique based on Golay
codes. Golay codes are special pairs of binary sequences A and B such that the sum
of the autocorrelation of sequence A and the autocorrelation of sequence B has only a single
non-zero value (i.e., is an impulse). The use of Golay codes allows a very rapid HRTF
measurement with an acceptable signal-to-noise ratio. This is achieved by playing the
complementary Golay sequences A and B out of the loudspeaker (allowing sufficient time between
the two sequences for the sound field to decay), recording the responses to the sequences
from each microphone, convolving each output sequence with the time-reversed version of the
corresponding input sequence, and summing the two convolved sequences to determine the impulse
response of the complete system (sound card, amplifier, loudspeaker, HRTF, and microphone). [The
Golay-code based HRTF collection process is described in more detail in Foster (1986).]


Figure 4: Transfer function collection with the SNAPSHOT system.
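
The Golay measurement described above can be sketched in a few lines. This is a minimal
illustration and not the SNAPSHOT software: it builds a complementary pair with the standard
recursive doubling construction and replaces the loudspeaker, room, and microphone chain with a
hypothetical simulated system (a short made-up impulse response). All function names are
assumptions.

```python
def golay_pair(order):
    """Return complementary sequences A and B of length 2**order (values +1/-1)."""
    a, b = [1.0], [1.0]
    for _ in range(order):
        # Doubling step: A' = A|B, B' = A|(-B) stays a complementary pair
        a, b = a + b, a + [-x for x in b]
    return a, b


def convolve(x, h):
    """Direct-form linear convolution."""
    out = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            out[i + j] += xi * hj
    return out


def measure_impulse_response(system, order):
    """Play A and B through `system` and recover its impulse response."""
    a, b = golay_pair(order)
    n = len(a)
    # Convolve each recorded response with the time-reversed input sequence
    corr_a = convolve(system(a), a[::-1])
    corr_b = convolve(system(b), b[::-1])
    # The summed autocorrelations equal 2n at zero lag and 0 elsewhere
    total = [(x + y) / (2 * n) for x, y in zip(corr_a, corr_b)]
    # Zero lag sits at index n - 1 of the full convolution
    return total[n - 1:]


# Hypothetical system under test: a 1-sample delay followed by two echoes
true_response = [0.0, 1.0, 0.5, -0.25]


def simulated_system(x):
    return convolve(x, true_response)


estimate = measure_impulse_response(simulated_system, order=5)
```

In this noise-free simulation the first four values of `estimate` reproduce `true_response`
and the remaining values are zero. In a real measurement the two sequences are played one
after the other, with a pause between them, as described above.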

The  SNAPSHOT  system  was  specifically  designed  to  measure  the  HRTF  in  reverberant 
rooms.  This  is  achieved  by  time-windowing  the  measured  impulse  responses  to  capture  the 
part  of  the  impulse  response  related  to  the  direct  sound  path  from  the  speaker  to  the  head 
and  to  ignore  the  later  parts  of  the  impulse  response  caused  by  sound  reflections  off  the  walls 
of  the  room. 
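
The time-windowing idea can be sketched as follows. The SNAPSHOT manual does not document the
actual window, so the half-cosine taper, the speed of sound constant, and all names and
parameter values here are assumptions for illustration only.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air at room temperature


def reflection_free_samples(direct_path_m, reflected_path_m, sample_rate_hz):
    """Whole samples between the direct arrival and the first reflection."""
    extra_delay_s = (reflected_path_m - direct_path_m) / SPEED_OF_SOUND_M_S
    return int(extra_delay_s * sample_rate_hz)


def window_impulse_response(ir, onset, keep, taper):
    """Keep `keep` samples starting at `onset`; fade the last `taper` samples to zero."""
    windowed = []
    fade_start = keep - taper
    for k in range(keep):
        index = onset + k
        sample = ir[index] if index < len(ir) else 0.0
        if taper > 0 and k >= fade_start:
            # Half-cosine fade from 1 toward 0 over the taper region
            gain = 0.5 * (1.0 + math.cos(math.pi * (k - fade_start + 1) / taper))
        else:
            gain = 1.0
        windowed.append(sample * gain)
    return windowed


# Hypothetical geometry: 1.0 m direct path, 2.0 m shortest reflected path, 48 kHz sampling
usable = reflection_free_samples(1.0, 2.0, 48000)
```

Everything after the first `usable` samples of the direct sound may contain room reflections
and would be discarded by the window.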


2.4  SNAPSHOT  measurement  procedure 

The  SNAPSHOT  procedure  allows  a  single  experimenter  to  collect  a  set  of  72  HRTFs 
from  a  subject  in  less  than  30  minutes.  The  procedure  can  be  summarized  as  follows: 

1.  The  subject  is  examined  with  an  otoscope  and  the  microphones  are  wrapped  in  a  seal 
constructed  of  acoustic  foam  and  inserted  into  the  subject’s  ear  canals.  A  velcro  band 
wrapped  around  the  subject’s  head  holds  the  microphone  leads  in  place  and  prevents 
them  from  pulling  out  the  microphones  (Figure  3). 

2.  The  subject  is  seated  on  a  swivel  chair  located  1.0  m  from  the  loudspeaker.  The 
SNAPSHOT  requires  the  area  in  the  room  within  0.7  m  of  the  loudspeaker  and  the 
subject  to  be  free  of  reflecting  objects  in  order  to  eliminate  reverberant  sound  from  the 
transfer  function  measurement. 


3.  The  HRTFs  are  collected  by  rotating  the  swivel  chair  through  12  positions  in  azimuth 
(every  30°)  and  measuring  the  left  and  right  HRTFs  at  each  position.  The  12  positions 
are  repeated  at  each  of  six  different  speaker  heights,  ranging  from  54°  in  elevation  to 
-36°  in  18°  increments.  Thus,  a  total  of  72  HRTF  measurements  are  made  for  each 
ear. 

4.  The  SNAPSHOT  processes  the  measured  HRTFs  and  archives  them  into  a  “.AHM” 
file,  a  proprietary  format  for  use  in  the  Crystal  River  Engineering  Convolvotron  audio 
display system. This processing is not described in any detail in the SNAPSHOT
manual,  but  it  apparently  is  designed  to  remove  the  characteristics  of  the  microphone 
and  speaker  as  well  as  the  reverberation  characteristics  of  the  room. 
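
The measurement grid in step 3 can be enumerated as a quick check on the count of 72 positions.
The choice of 0° as the first azimuth and the variable names are assumptions; the source gives
only the 30° spacing and the elevation range.

```python
# 12 chair positions, every 30 degrees (starting angle of 0 is assumed)
AZIMUTHS_DEG = list(range(0, 360, 30))

# 6 speaker heights from +54 to -36 degrees in 18-degree steps
ELEVATIONS_DEG = list(range(54, -37, -18))

# One (azimuth, elevation) pair per measurement; each yields a left and a right HRTF
grid = [(az, el) for el in ELEVATIONS_DEG for az in AZIMUTHS_DEG]
```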
3.0 PSYCHOPHYSICAL EVALUATION OF
SNAPSHOT MEASUREMENT SYSTEM


In  order  to  evaluate  the  effectiveness  of  the  SNAPSHOT  system  for  collecting  accurate 
HRTF measurements, a simple psychoacoustic experiment was performed. In this experiment,
the localization accuracy of three subjects was tested with the “SDO” HRTFs provided
with  the  Convolvotron,  with  their  own  HRTFs  measured  with  the  SNAPSHOT,  and  with 
measurements  made  on  a  KEMAR  manikin  with  the  SNAPSHOT. 

3.1  Methods 


3.1.1  Subjects 


Two  males  and  one  female,  ages  20-35,  served  as  subjects  in  the  experiment.  All  reported 
normal  hearing  in  both  ears.  Two  of  the  subjects  had  considerable  experience  with  the 
experimental  procedure  from  participating  in  an  unrelated  localization  experiment  (using 
the  same  paradigm  and  response  method)  over  the  previous  two  months.  One  of  the  subjects 
had  no  experience  in  auditory  localization  studies. 


3.1.2  HRTF  measurements 

The  SNAPSHOT  measurement  system  was  used  to  measure  a  full  set  of  72  HRTFs 
on  each  of  the  three  subjects  using  the  procedure  outlined  in  the  previous  section.  In 
addition,  a  full  set  of  HRTF  measurements  was  made  on  a  KEMAR  acoustic  manikin  with  the 
microphones  inserted  in  the  ear  canal  openings  of  the  rubber  pinnae  provided  with  KEMAR. 
The  SNAPSHOT  system  was  used  to  archive  the  HRTFs  in  the  “.AHM”  file  format  for  use 
with  the  Convolvotron  system.  In  addition,  the  “SDO”  HRTF  files,  which  were  measured 
originally  by  Wightman  and  Kistler  (1989)  and  are  provided  with  the  Convolvotron,  were 
used  as  a  control. 
3.1.3 Stimulus


The auditory stimulus used in the experiment consisted of short (300 ms) bursts of pink
noise presented binaurally over headphones. The noise, produced by a GenRad 1382 noise
generator, was used as an input signal for the Convolvotron, which added localization cues to
the signal by convolving the noise with the HRTFs stored in the “.AHM” files. The output of
the Convolvotron was then presented to the subject through Sennheiser HD520 headphones.


3.1.4  Response 


The GELP (God’s Eye Localization Pointing) method was used to collect subject responses
(Gilkey, Good, Ericson, Brinkman, & Stewart, 1995). In the GELP method, the
subject is asked to use a stylus to point to the location on the surface of a solid plastic
sphere that best matches the perceived direction of the sound source. Gilkey’s validation
experiments have shown that this method produces a localization error of approximately 10°
when responding to verbal coordinates (azimuth and elevation) and of approximately 20°
when responding to a free-field directional sound source.


3.1.5  Procedure 

In  each  experimental  session,  the  subject  responded  to  a  total  of  185  stimuli  located  at 
37  different  positions  randomly  distributed  over  the  surface  of  the  sphere.  Each  subject 
participated in a total of three sessions. In the first session, the subjects listened to the
“SDO”  HRTFs.  In  the  second  session,  they  listened  to  their  own  HRTFs  previously  measured 
with  the  SNAPSHOT  system.  In  the  third  session,  they  listened  to  the  KEMAR  HRTFs. 
Each  session  took  approximately  20  minutes  to  complete,  and  the  subjects  were  given  a  short 
break  between  sessions. 


3.2  Results 


3.2.1  Front-back  reversals 

In  this  experiment,  as  in  previous  localization  experiments  without  head  motion,  the 
subjects experienced a number of front-back reversals. A front-back reversal occurs when a
listener perceives a sound source at the mirror image of its true location across the frontal
plane. For example, a sound source at 45° in azimuth might be perceived at 135°. These
front-back reversals occur because the dominant interaural cues (interaural time delay and
interaural intensity difference) are approximately cylindrically symmetric around the interaural
axis of the head (Wallach, 1940), and thus cannot be used to distinguish between
symmetric source locations in the front and rear hemispheres.


Figure 5: Percentage of front-back reversals for each subject and each type of HRTF. The
error bars show the 95% confidence interval. (Horizontal axis: subjects A, B, C, and mean.)


Front-back confusions occurred in approximately 30% of the trials overall (Figure 5).
Each of the three subjects reversed the smallest number of trials with the SDO transfer
functions, and the largest number of trials with the individualized transfer functions. However,
a two-factor analysis of variance (ANOVA) indicates that neither the main effect of
subject nor that of HRTF type was significant at the p < 0.05 level. The consistency of the
pattern of performance across subjects is confirmed by the lack of any interaction between the
subject and HRTF factors (F(4,1629) = 0.04509).

Trials  where  front-back  confusions  occurred  were  “corrected”  by  reflecting  the  response 
locations  across  the  frontal  plane  prior  to  analyzing  the  data  for  directional  errors. 
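
The reflection used to correct reversals can be sketched as below. The azimuth convention (0°
straight ahead, 90° on the interaural axis, 180° directly behind) follows the 45° and 135°
example given earlier. The rule for deciding that a trial is a reversal is not stated in this
report, so the criterion in `is_front_back_reversal` is an assumption, as are all function
names.

```python
import math


def mirror_front_back(az_deg):
    """Reflect an azimuth across the frontal plane (e.g. 45 -> 135)."""
    return (180.0 - az_deg) % 360.0


def in_front(az_deg):
    """True when the azimuth lies in the front hemisphere."""
    return math.cos(math.radians(az_deg)) > 0.0


def azimuth_difference(a_deg, b_deg):
    """Smallest absolute difference between two azimuths, in degrees."""
    d = (a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def is_front_back_reversal(stim_az, resp_az):
    """Assumed criterion: hemispheres differ and the mirrored response is closer."""
    mirrored = mirror_front_back(resp_az)
    return (in_front(stim_az) != in_front(resp_az)
            and azimuth_difference(stim_az, mirrored) < azimuth_difference(stim_az, resp_az))


def correct_reversal(stim_az, resp_az):
    """Return the response azimuth, reflected if the trial counts as a reversal."""
    if is_front_back_reversal(stim_az, resp_az):
        return mirror_front_back(resp_az)
    return resp_az % 360.0
```

For a stimulus at 45° and a response at 140°, this sketch reflects the response to 40°, leaving
a 5° azimuth error for the directional analysis.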

Figure 6: Overall angular error. Front-back reversals have been corrected in these data. The
error bars show the 95% confidence intervals. (Horizontal axis: subject.)


3.2.2  Angular  errors 

The  angular  error  is  defined  as  the  angle  of  arc  between  the  stimulus  vector  (the  vector 
from  the  origin  to  the  stimulus  location)  and  the  response  vector  (the  vector  from  the  origin  to 
the  response  location).  The  angular  error  represents  the  overall  error  between  the  stimulus 
and  response  locations  and  includes  the  effects  of  both  azimuth  and  elevation  errors.  In 
this  experiment,  the  overall  angular  error  was  consistently  lowest  with  the  SDO  transfer 
functions,  and  highest  with  the  KEMAR  transfer  functions  (Figure  6).  A  two-factor  repeated 
measures  ANOVA  indicates  that  the  main  effects  of  subject  and  transfer  function  type  are 
both  significant  at  the  p  <0.05  level. 
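
The angular error defined above is the angle between two unit vectors and can be computed from
their dot product. The sketch below assumes azimuth and elevation in degrees with x pointing
forward, y along the interaural axis, and z up; the axis convention and function names are
assumptions, not taken from the report.

```python
import math


def direction_vector(az_deg, el_deg):
    """Unit vector from the origin toward (azimuth, elevation)."""
    az, el = math.radians(az_deg), math.radians(el_deg)
    return (math.cos(el) * math.cos(az),
            math.cos(el) * math.sin(az),
            math.sin(el))


def angular_error_deg(stimulus, response):
    """Angle of arc between stimulus and response directions, in degrees."""
    s = direction_vector(*stimulus)
    r = direction_vector(*response)
    dot = sum(a * b for a, b in zip(s, r))
    # Clamp to guard against rounding slightly outside [-1, 1]
    dot = max(-1.0, min(1.0, dot))
    return math.degrees(math.acos(dot))
```

Because the angle is measured on the sphere, a given azimuth error contributes less to the
angular error at high elevations than it does on the horizontal plane.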

3.2.3  Azimuth  and  elevation  errors 

The  relative  contributions  of  the  errors  in  azimuth  and  elevation  can  be  determined 
from  the  standard  deviation  of  the  azimuth  error  and  the  elevation  error.  The  standard 
deviation  is  a  useful  measure  because  it  ignores  the  effects  of  systematic  bias  in  the  results 
and  concentrates  only  on  response  variability.  The  results  indicate  that  most  of  the  variation 
in error across transfer functions is a result of errors in the azimuth of the sound source rather
than the elevation of the sound source (Figure 7).
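
A short sketch shows why the standard deviation is insensitive to systematic bias: adding a
constant offset to every response leaves it unchanged. The report does not say whether the
sample or population formula was used, so the sample standard deviation here is an assumption,
as are the function names and the made-up data.

```python
import statistics


def wrap_deg(angle):
    """Wrap an angle difference into the range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def error_spread(stimuli_deg, responses_deg):
    """Sample standard deviation of the signed response errors."""
    errors = [wrap_deg(r - s) for s, r in zip(stimuli_deg, responses_deg)]
    return statistics.stdev(errors)


# Hypothetical azimuth data (degrees)
stimuli = [0.0, 90.0, 180.0, 270.0]
responses = [10.0, 95.0, 200.0, 285.0]

spread = error_spread(stimuli, responses)
# The same responses with a constant 30-degree bias added
biased_spread = error_spread(stimuli, [r + 30.0 for r in responses])
```

Here `spread` and `biased_spread` are equal, so the measure reflects only response variability.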


3.3  Discussion 


3.3.1  Poor  overall  performance 


The  overall  localization  performance  measured  in  this  experiment  was  very  poor  in  each 
of  the  three  conditions  tested.  Previous  experiments  in  the  free  field  and  with  virtual  audio 
displays  have  shown  that  human  localization  performance  is  substantially  better  than  was 
indicated  in  this  experiment.  For  example,  Wightman  and  Kistler  (1989)  found  that  the 
mean  overall  angular  error  was  approximately  21.3°  with  a  free-field  stimulus  and  22.3° 
with  a  virtual  stimulus,  compared  to  an  average  error  of  35.5°  in  the  best  condition  of  this 
experiment.  Similarly,  Wightman  and  Kistler  reported  reversals  in  only  5.6%  of  trials  with 
a  free-field  stimulus  and  9.6%  of  trials  with  a  virtual  stimulus,  compared  to  more  than  29% 
of  trials  in  the  best  condition  of  this  experiment. 

The  poor  performance  observed  in  this  experiment  cannot  be  attributed  to  the  GELP 
response method, which has been shown to produce angular errors of only 18.2° with a
free-field stimulus (Gilkey et al., 1995). The errors do not appear to be a result of a systematic
failure of the experimental procedure, as there were significant differences in subject performance
across the three conditions and the relative ordering of performance in the three
conditions was identical across the three subjects. Training does not appear to be an issue
either. Two of the subjects (A and B) had considerable experience with the experimental
procedure prior to this experiment, and while the third subject (C) performed considerably
worse  than  the  two  experienced  subjects,  the  relative  results  in  the  three  conditions  were 
similar.  The  performance  may  be  attributable  to  the  interpolation  algorithm  used  by  the 
Convolvotron.  Although  the  Convolvotron  is  often  cited  in  the  literature,  we  were  unable 
to  identify  a  single  study  validating  its  performance  in  a  location  identification  experiment 
without  head-tracking. 

Note  that,  while  overall  performance  was  poor,  the  differences  between  performance  in 
the  three  experimental  conditions  were  significant.  Furthermore,  these  differences  cannot  be 
attributed  to  training  effects  because  the  best  performance  was  always  seen  with  the  SDO 
transfer  functions  that  were  presented  first  for  each  of  the  subjects.  Therefore,  although 
overall  performance  was  poor,  the  results  can  still  be  used  to  compare  performance  across 
the  three  types  of  HRTFs. 

Figure 7: Standard deviation of error in azimuth (top) and elevation (bottom). Front-back
reversals have been corrected in the azimuth data.


3.3.2 Evaluation of the HRTF measurements with the SNAPSHOT


The data show that the individualized HRTFs measured with the SNAPSHOT system
failed to provide any performance benefit over the generic SDO transfer functions that are
supplied with the Convolvotron. In fact, the individualized transfer functions measured
with the SNAPSHOT system produced significantly larger angular errors than the
SDO transfer functions (45° vs. 35°) and consistently produced a larger number of front-back
reversals than the SDO transfer functions.

The individualized HRTFs measured with the SNAPSHOT did, however, produce significantly
better performance than the KEMAR HRTFs measured with the SNAPSHOT. This
is consistent with previous studies that have indicated that listeners can localize more accurately
with HRTFs measured from their own ears than with generic HRTFs (Wenzel, 1991).
However, it is striking that the majority of the degradation in directional accuracy with the
KEMAR HRTFs is in azimuth (Figure 7), rather than elevation, while most previous studies
have indicated that individualized HRTFs provide the greatest benefit to elevation performance
(Wenzel, 1991). Furthermore, the number of front-back reversals was greater with the
individualized HRTFs than with the KEMAR HRTFs, while most previous studies indicate
that individual pinnae cues play an important role in disambiguating front-back confusions
(Wenzel et al., 1993). It should be noted that the subjects exhibited a surprisingly large
number of left-right reversals with the KEMAR manikin HRTFs, indicating that either the
interaural time delay or the interaural intensity difference was not correctly represented by
the KEMAR HRTFs measured with the SNAPSHOT system.

3.3.3  Explanations  for  poor  performance  with  the  SNAPSHOT 

The most likely explanation for the extremely poor performance with the individualized
HRTFs (and the KEMAR HRTFs) is a failure of the SNAPSHOT system to adequately measure
the HRTFs. The HRTFs were measured carefully according to the procedure outlined
in the SNAPSHOT manual. However, there was some difficulty in adequately placing the
microphones in the ear canals of the listeners. In fact, HRTF measurements were initially
made with four subjects rather than three, but the fourth subject had to be excluded
because his HRTFs were greatly attenuated in the right ear due to improper placement of
the microphone.

Another possible reason for poor localization accuracy with the SNAPSHOT HRTFs is
that the SDO transfer functions include more points than the SNAPSHOT measurements
(144 vs. 72). This requires more aggressive interpolation of the transfer functions with the
SNAPSHOT HRTFs, and may result in less accurate localization with the HRTFs measured
with the SNAPSHOT system. Also, the SDO measurements were made with a spherical
coordinate system rather than the cylindrical coordinate system required with the SNAPSHOT.

Note  that  the  problems  with  the  KEMAR  manikin  transfer  functions  may  result  from  the 
difficulty  in  placing  the  SNAPSHOT  microphones,  which  are  designed  to  fit  into  a  human 
ear  canal,  into  the  rubber  pinnae  of  the  manikin.  These  rubber  pinnae  have  a  very  shallow 
ear  canal  and  may  not  have  properly  accommodated  the  SNAPSHOT  microphones. 
4.0 CONCLUSIONS AND RECOMMENDATIONS


Based  on  this  experiment,  the  SNAPSHOT  system  cannot  be  recommended  for  use  as  a 
tool  to  collect  individualized  head-related  transfer  functions.  Although  it  does  allow  rapid 
collection of HRTFs (72 positions in less than 30 minutes), it has a number of serious
drawbacks:

1. The HRTFs collected with the SNAPSHOT system produce consistently less accurate
localization performance than the generic SDO HRTFs provided with the Convolvotron.
This difference was significant for the individualized HRTFs collected on
three different subjects. On average, the angular error was 25% larger with the individualized
SNAPSHOT HRTFs than with the SDO HRTFs. There were also a greater
number of front-back reversals with the individualized HRTFs. Thus, although the
SNAPSHOT allows rapid collection of individualized HRTFs, these results suggest
that the HRTFs collected with the SNAPSHOT are so poor in quality that any advantages
of individualized vs. generic HRTFs are lost.

2.  The  processing  performed  by  the  SNAPSHOT  when  the  measured  HRTFs  are  stored 
as  acoustic  head  maps  (.AHM  files)  for  use  with  the  Convolvotron  is  not  clear  from  the 
documentation  provided.  The  manual  suggests  that  this  processing  eliminates  room 
reflections  and  the  frequency  response  of  the  speaker  from  the  measurements.  However, 
no  details  are  given.  Furthermore,  the  “.AHM”  file  format  is  a  proprietary  format  for 
the  Convolvotron,  and  once  the  HRTFs  are  stored  in  this  format  it  is  impossible  to 
examine  them.  The  functions  for  the  Alphatron  that  are  supposed  to  allow  examination 
of the “.AHM” files (described in the manual) do not exist. Thus it is impossible to
verify  that  the  HRTF  files  have  been  properly  collected  without  downloading  them  to 
a  Convolvotron  system. 

3.  The  available  documentation  of  the  system  is  incomplete.  Only  a  draft  manual  is 
available,  which  lacks  figures  and  references.  Furthermore,  entire  sections  are  missing 
from  the  manual  and  many  of  the  functions  documented  in  the  manual  are  not  available 
on  the  Alphatron  workstation. 